Resolving Ambiguities in Sentence Boundary Detection in Russian Spontaneous Speech
Abstract
The paper analyses inter-labeller agreement in manual annotations of transcribed spontaneous speech and suggests a way to resolve ambiguities in expert labelling. It argues that the number of controversial sentence boundaries may be reduced if some of them are regarded as “zones”. We describe a technique for detecting these zones and analyse which syntactic structures are most likely to appear in them. Though the approach is based on Russian language material, it may be applied to oral texts in other languages.
Related resources
Dependency structure analysis and sentence boundary detection in spontaneous Japanese
This paper addresses automatic detection of dependencies between Japanese phrasal units called bunsetsus, and sentence boundaries in a spontaneous speech corpus. In spontaneous speech, the biggest problem with dependency structure analysis is that sentence boundaries are ambiguous. In this paper, we propose two methods for improving the accuracy of sentence boundary detection in spontaneous Jap...
Sentence boundaries in text and pauses in speech: Correlation or confrontation?
The paper explores the interaction between sentence boundaries marked by annotators in transcriptions of Russian spontaneous speech and actual prosodic boundaries in the signal. The aim of the research is to investigate whether annotators’ prosodic competence allows them to correctly detect sentence boundaries in speech based on textual information only. We found that inter-annotator agreement ...
Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machines
This paper presents two different approaches utilizing statistical language model (SLM) and support vector machines (SVM) for sentence boundary detection of spontaneous Japanese. In the SLM-based approach, linguistic likelihoods and occurrence of pause are used to determine sentence boundaries. To suppress false alarms, heuristic patterns of end-of-sentence expressions are also incorporated. On...
Methods for Partial Sentence Recognition and Unknown Words Detection by Sentence Spotting on Continuous Speech
Spontaneous speech includes many sentences that fall outside the task domain. Furthermore, the boundary between sentences is often unclear in spontaneous speech because of phenomena such as corrections, stammering, or overlap with the next utterance. We previously developed a sentence spotting system that uses Vector Continuous Dynamic Programming (VCDP). This system works well for sentence spotting i...
Session 5aSC: Speech Communication 5aSC26. The role of fundamental frequency and temporal envelope in processing sentences with temporary syntactic ambiguities
While natural speech prosody facilitates sentence processing, unnatural or misleading prosody decreases speed and accuracy in resolving syntactic ambiguities. This study investigated how, and to what extent, fundamental frequency (F0) and the temporal envelope (E) contribute to processing garden-path sentences that provide misleading grammatical interpretations. Signal processing methods degrade...
Publication date: 2013